Abstract

In this paper we consider the problem of extracting secret key from an eavesdropped source
$p_{XYZ}$ at a rate given by the conditional mutual information. We investigate this question under
three different scenarios: (i) Alice (X) and Bob (Y) are unable to communicate but share common
randomness with the eavesdropper Eve (Z), (ii) Alice and Bob are allowed one-way public
communication, and (iii) Alice and Bob are allowed two-way public communication. Distributions
having a key rate of the conditional mutual information are precisely those in which a
"helping" Eve offers Alice and Bob no greater advantage for obtaining secret key than a fully
adversarial one. For each of the above scenarios, strong necessary conditions are derived on the
structure of distributions attaining a secret key rate of $I(X:Y|Z)$. In obtaining our results, we
completely solve the problem of secret key distillation under scenario (i) and identify $H(S|Z)$
to be the optimal key rate using shared randomness, where S is the Gács-Körner Common
Information. We thus provide an operational interpretation of the conditional Gács-Körner
Common Information. Additionally, we introduce simple example distributions in which the
rate $I(X:Y|Z)$ is achievable if and only if two-way communication is allowed.


1 Introduction 

A basic information-processing task involves the exchange of secret information between Alice
(X) and Bob (Y) in the presence of an eavesdropper, Eve (Z). If Alice and Bob have some
pre-established key that is secret from Eve, then any future message M can be transmitted using the
key as a one-time pad. Thus, the problem of private communication can be reduced to the problem
of secret key distillation, which studies the extraction of secret key $\Phi_{XY} \cdot q_Z$ from some
initial tripartite correlation $p_{XYZ}$. Here, $\Phi_{XY}$ is a perfectly correlated bit and $q_Z$ is an
arbitrary distribution. Often, the correlations $p_{XYZ}$ are presented as a many-copy source
$p^n_{XYZ}$, and Alice and Bob wish to know the optimal rate of secret bits per copy that they can
distill from this source.
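
As a small illustration of the one-time pad step mentioned above, the following sketch XORs a message with a key of equal length. The names `otp_xor`, `message`, and `key` are hypothetical, and the key is drawn from the operating system's random source purely as a stand-in for a key that Alice and Bob would have distilled beforehand.

```python
import secrets


def otp_xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of the same length; the same call encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("a one-time pad key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet at noon"
# Assumption: this random key stands in for previously distilled secret key.
key = secrets.token_bytes(len(message))
ciphertext = otp_xor(message, key)     # what Alice would send over the public channel
recovered = otp_xor(ciphertext, key)   # what Bob computes with his copy of the key
```

Each key bit is consumed by one message bit, which is why the rate of distillable secret bits per copy is the natural figure of merit.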

It turns out that Alice and Bob can often enhance their distillation capabilities by openly
disclosing some information about X and Y through public communication [AC93, Mau93]. In
general, Alice and Bob's communication schemes can be interactive, with one round of communication
depending on what particular messages were broadcast in previous rounds. Such interactive
protocols are known to generate higher key rates than non-interactive protocols, at least in the
absence of "noisy" local processing by Alice and Bob [Mau93]. Thus, for a given distribution
$p_{XYZ}$, one obtains a hierarchy of key rates pertaining to the respective scenarios of no
communication, one-way communication, and two-way (interactive) communication. It is also possible
to consider no-communication scenarios in which Alice and Bob have access to some publicly shared
randomness that is uncorrelated with their primary source $p_{XYZ}$. Clearly, publicly shared
randomness is a weaker resource than public communication since the latter is able to generate the
former. However, below we will prove the stronger statement that publicly shared randomness offers
no advantage whatsoever for secret key distillation.

For the one-way communication scenario, a single-letter characterization of the key rate has
been proven by Ahlswede and Csiszár [AC93]. When the unidirectional communication is from
Alice to Bob, we denote the key rate by $\overrightarrow{K}(X:Y\|Z)$, while
$\overleftarrow{K}(X:Y\|Z)$ denotes the rate when communication is from Bob to Alice only. No
formula is known for the two-way key rate of a given distribution, which we denote by
$K(X:Y\|Z)$, and the complexity of protocols utilizing interactive communication makes computing
this a highly challenging open problem.

In the special case of an uncorrelated Eve in $p_{XYZ}$, the key rate is given by the mutual
information $I(X:Y)$, and this can be achieved using one-way communication. For more general
distributions in which Eve possesses some side information of XY, the conditional mutual
information $I(X:Y|Z)$ is a known upper bound for the key rate under two-way communication
[AC93, Mau93]. In general this bound is not tight [MW99]. Rather, the conditional mutual
information quantifies the key rate when Eve helps Alice and Bob by broadcasting her variable Z.
Key obtained by a helping Eve is also known as private key [CN00], and private key is still secret
from Eve even though she helps Alice and Bob obtain it. The relevance of private key naturally
arises in situations where Eve functions as a central server who helps establish secret correlations
between Alice and Bob. Thus, distributions with a secret key rate equaling the private key rate of
$I(X:Y|Z)$ are precisely those in which nothing is gained by a helping Eve.

The objective of this paper is to investigate the types of distributions for which $I(X:Y|Z)$
is indeed an achievable secret key rate. This will be considered under the scenarios of (i)
publicly shared randomness but no communication, (ii) one-way communication, and (iii) two-way
communication. A full solution to the problem would involve a structural characterization of the
distributions $p_{XYZ}$ whose key rates are $I(X:Y|Z)$. We are able to fully achieve this only for the
no-communication setting, but we nevertheless derive strong necessary conditions for both the
one-way and the two-way scenarios. In the case of one-way communication, our condition makes
use of the key-rate formula derived by Ahlswede and Csiszár. For the statement of this formula,
recall that three variables A, B, and C satisfy the Markov chain $A - B - C$ if C is conditionally
independent of A given B; i.e. $p(c|b,a) = p(c|b)$ for letters in the range of A, B, and C. Then,

Lemma 1 ([AC93]). For distribution $p_{XYZ}$,

$$\overrightarrow{K}(X:Y\|Z) = \max_{KU|XYZ} I(K:Y|U) - I(K:Z|U), \tag{1}$$

where the maximization is taken over all auxiliary variables K and U satisfying the Markov chain
$KU - X - YZ$, with K and U ranging over sets of size no greater than $|\mathcal{X}| + 1$. In particular,

$$\overrightarrow{K}(X:Y\|Z) \ge I(X:Y) - I(X:Z). \tag{2}$$
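
To make the lower bound of Eq. (2) concrete, the sketch below evaluates $I(X:Y) - I(X:Z)$ for a joint distribution given as a dictionary. The helper names (`marginal`, `mutual_information`, `one_way_lower_bound`) and the two toy distributions are assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict
from math import log2


def marginal(p, idx):
    """Marginalize a joint pmf {outcome tuple: prob} onto the coordinates in idx."""
    out = defaultdict(float)
    for outcome, prob in p.items():
        out[tuple(outcome[i] for i in idx)] += prob
    return dict(out)


def mutual_information(p, a, b):
    """I(A : B) in bits, where a and b are tuples of coordinate indices."""
    pa, pb, pab = marginal(p, a), marginal(p, b), marginal(p, a + b)
    total = 0.0
    for outcome, prob in pab.items():
        if prob > 0:
            total += prob * log2(prob / (pa[outcome[:len(a)]] * pb[outcome[len(a):]]))
    return total


def one_way_lower_bound(p_xyz):
    """Right-hand side of Eq. (2): I(X : Y) - I(X : Z), for a pmf {(x, y, z): prob}."""
    return mutual_information(p_xyz, (0,), (1,)) - mutual_information(p_xyz, (0,), (2,))


# Toy case 1: X = Y is a uniform bit and Z is an independent uniform bit.
independent_eve = {(x, x, z): 0.25 for x in (0, 1) for z in (0, 1)}
# Toy case 2: X = Y = Z is a uniform bit, so Eve knows everything.
copying_eve = {(x, x, x): 0.5 for x in (0, 1)}

bound_a = one_way_lower_bound(independent_eve)   # one bit of key per copy
bound_b = one_way_lower_bound(copying_eve)       # the bound gives nothing
```

The bound is only a lower bound: it corresponds to the particular choice K = X and constant U in Eq. (1), and it can be negative even when the key rate is positive.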

In this paper, we consider when variables KU can be found that satisfy both $KU - X - YZ$
and $I(K:Y|U) - I(K:Z|U) = I(X:Y|Z)$. Theorem 2 below offers a necessary condition on the
structure of distributions for which this is possible. Turning to the scenario of two-way
communication, we utilize the well-known intrinsic information upper bound on $K(X:Y\|Z)$. For
distribution $p_{XYZ}$, its intrinsic information is given by

$$I(X:Y \downarrow Z) := \min_{\bar{Z}|Z} I(X:Y|\bar{Z}), \tag{3}$$

where the minimization is taken over all auxiliary variables $\bar{Z}$ satisfying $XY - Z - \bar{Z}$,
with $\bar{Z}$ having the same range as Z [CRW03]. Thus, the intrinsic information is the smallest
conditional mutual information achievable after Eve processes her variable Z. The intrinsic
information satisfies $K(X:Y\|Z) \le I(X:Y \downarrow Z)$. In Theorem 3 below, we identify a large class
of distributions for which a channel $\bar{Z}|Z$ can be found satisfying $I(X:Y|\bar{Z}) < I(X:Y|Z)$.
This allows us to derive a necessary condition on distributions having $K(X:Y\|Z) = I(X:Y|Z)$.
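
The effect of Eve processing her variable can be seen in a familiar toy case, sketched below under stated assumptions: X and Y are independent uniform bits and Z is their XOR. The function names and the channel `forget` are hypothetical. Evaluating Eq. (3) at one candidate channel only yields an upper bound on the intrinsic information, but here that bound is already zero.

```python
from collections import defaultdict
from math import log2


def cond_mutual_information(p_xyz):
    """I(X : Y | Z) in bits for a joint pmf {(x, y, z): prob}."""
    pz, pxz, pyz = defaultdict(float), defaultdict(float), defaultdict(float)
    for (x, y, z), prob in p_xyz.items():
        pz[z] += prob
        pxz[x, z] += prob
        pyz[y, z] += prob
    return sum(prob * log2(prob * pz[z] / (pxz[x, z] * pyz[y, z]))
               for (x, y, z), prob in p_xyz.items() if prob > 0)


def process_eve(p_xyz, channel):
    """Apply a channel {z: {zbar: prob}} to Eve's variable and return p_{X Y Zbar}."""
    out = defaultdict(float)
    for (x, y, z), prob in p_xyz.items():
        for zbar, q in channel[z].items():
            out[x, y, zbar] += prob * q
    return dict(out)


# Assumed toy source: X, Y independent uniform bits, Z = X xor Y.
p = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}
# Candidate channel: Eve maps every outcome to the symbol 0, i.e. she forgets Z.
forget = {0: {0: 1.0}, 1: {0: 1.0}}

before = cond_mutual_information(p)                       # I(X : Y | Z)
after = cond_mutual_information(process_eve(p, forget))   # I(X : Y | Zbar)
```

In this example the conditional mutual information is one bit while the processed value vanishes, so the bound $K(X:Y\|Z) \le I(X:Y \downarrow Z)$ shows that no key can be distilled.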

A brief summary of our results is the following: 

• For publicly shared randomness with no communication, we identify $H(J_{XY}|Z)$ as the
secret key rate, where $J_{XY}$ is the Gács-Körner Common Information of Alice and Bob's
marginal distribution $p_{XY}$. Moreover, this rate is achievable without using shared
randomness. Using this result, the structure of distributions attaining $I(X:Y|Z)$ can easily be
characterized.

• When one-way communication is permitted between Alice and Bob, we show that the
distribution $p_{XYZ}$ must satisfy a certain "block-like" structure in order to obtain the key rate
$I(X:Y|Z)$. Specifically, given some outcome z of Eve, if there exist collections of events
$\mathcal{X}_0$ and $\mathcal{Y}_0$ for Alice and Bob respectively that satisfy
$p(\mathcal{Y}_0|\mathcal{X}_0, z) = p(\mathcal{X}_0|\mathcal{Y}_0, z) = 1$, then
$p(\mathcal{Y}_0|\mathcal{X}_0) = p(\mathcal{X}_0|\mathcal{Y}_0) = 1$; i.e. the conditional
probabilities hold regardless of Eve's outcome.

• For key distillation with two-way communication, we show that distributions attaining a
key rate of $I(X:Y|Z)$ must also satisfy a certain type of uniformity similar to the one-way
case. One special class of distributions our necessary condition applies to are those
obtained by mixing a perfectly correlated distribution $p_{XY}$ with an uncorrelated one such
that the marginal distributions have the same range and such that Eve's variable Z specifies
which one of the distributions Alice and Bob hold. We show that unless either Alice or Bob
can likewise identify the distribution from his or her variable, a key rate of $I(X:Y|Z)$ is
unattainable.

• We construct distributions in which a distillation rate of $I(X:Y|Z)$ is unachievable when
the communication is restricted from Alice to Bob, and yet it becomes achievable if the
communication direction is from Bob to Alice. We further provide an example in which $I(X:Y|Z)$
is achievable only if two-way communication is used. To our knowledge, these are the first
known examples rigorously demonstrating such communication dependency for optimal
key distillation. We then turn to the difference between single-party key extraction versus
shared key extraction by public communication. We completely characterize the distributions
in which the latter can be accomplished at the same rate as the former.

Before presenting these results in greater detail, we begin in Section 2 with a more precise
overview of the key rates studied in this paper. In Section 3, we then present the Gács-Körner
Common Information and prove some basic properties. Section 4 contains our main results, with
longer proofs postponed to the appendix. Finally, Section 5 offers some concluding remarks.

2 Definitions 

Let us review the relevant definitions of secret key rate under various communication scenarios. 
We consider random variables X, Y and Z ranging over finite alphabets $\mathcal{X}$, $\mathcal{Y}$,
and $\mathcal{Z}$ respectively. For a general distribution q, we say its support (denoted by
supp[q]) is the collection of x such that q(x) > 0. In all distillation tasks, we assume that Alice
and Bob each have access to one part of an i.i.d. (independent and identically distributed) source
XYZ whose distribution is $p_{XYZ}$. Hence, after n realizations of the source, $X^n$, $Y^n$ and $Z^n$
belong to Alice, Bob, and Eve respectively. In addition, Alice and Bob each possess a local random
variable, $Q_A$ and $Q_B$ respectively, which are independent of each other and of $X^nY^nZ^n$.
This allows them to introduce local randomness into their processing of $X^nY^n$.

We first turn to the most restrictive scenario, which is key distillation using publicly shared
randomness. The common randomness (c.r.) key rate of X, Y, and Z, denoted by $K_{c.r.}(X:Y\|Z)$, is
defined to be the largest R such that for every $\epsilon > 0$, there is an integer N such that
$n \ge N$ implies the existence of (a) a random variable W independent of $X^nY^nZ^n$ and ranging
over some set $\mathcal{W}$, (b) a random variable K ranging over some set $\mathcal{K}$, and (c) a
pair of mappings $f(X^n, Q_A, W)$ and $g(Y^n, Q_B, W)$ for which

(i) $\Pr[f = g = K] > 1 - \epsilon$;

(ii) $\log|\mathcal{K}| - H(K|Z^nW) < \epsilon$;

(iii) $\frac{1}{n}\log|\mathcal{K}| > R$.

We next move to the more general scenario in which Alice and Bob are allowed to engage in
public communication. A local operations and public communication (LOPC) protocol consists of a
sequence of public communication exchanges between Alice and Bob. The i-th message exchanged
between them is described by the variable $M_i$. If Alice (resp. Bob) is the broadcasting party in
round i, then $M_i$ is a function of $X^n$ and $Q_A$ (resp. $Y^n$ and $Q_B$) as well as the previous
messages $(M_1, M_2, \dots, M_{i-1})$. The protocol is one-way if there is only one round of message
exchange.

For distribution $p_{XYZ}$, the Alice-to-Bob secret key rate $\overrightarrow{K}(X:Y\|Z)$ is the
largest R that satisfies the above three conditions except with W being replaced by some message M
that is generated by Alice and is therefore a function of $(X^n, Q_A)$. We can likewise define the
Bob-to-Alice key rate $\overleftarrow{K}(X:Y\|Z)$. The (two-way) secret key rate of X and Y given Z,
denoted by $K(X:Y\|Z)$, is defined analogously except with $M = (M_1, M_2, \dots, M_r)$ being any
random variable generated by an LOPC protocol [Mau93, AC93]. The key rates satisfy the obvious
relationship:

$$K_{c.r.}(X:Y\|Z) \le \{\overrightarrow{K}(X:Y\|Z), \overleftarrow{K}(X:Y\|Z)\} \le K(X:Y\|Z). \tag{4}$$

3 The Gács-Körner Common Information

In this section, we introduce the Gács-Körner Common Information. For every pair of random
variables XY, there exists a maximal common variable $J_{XY}$ in the sense that $J_{XY}$ is a function
of both X and Y, and any other such common function of both X and Y is itself a function of
$J_{XY}$. Hence, up to relabeling, the variable $J_{XY}$ is unique for each distribution $p_{XY}$. In terms
of its structure, a distribution $p_{XY}$ can always be decomposed as

$$p(x,y) = \sum_{J_{XY}=j} p(x,y|j)\,p(j), \tag{5}$$

where for any $x, x' \in \mathcal{X}$ and $y, y' \in \mathcal{Y}$, the conditional distributions satisfy
$p(x,y|j)\,p(x,y'|j') = 0$ and $p(x,y|j)\,p(x',y|j') = 0$ if $j \ne j'$. Gács and Körner identify
$H(J_{XY})$ as the common information of XY [GK73].

It is instructive to rigorously prove the statements of the preceding paragraph. A common
partitioning of length t for XY is a collection of pairs of subsets
$(\mathcal{X}_i, \mathcal{Y}_i)_{i=1}^{t}$ such that

(i) $\mathcal{X}_i \cap \mathcal{X}_j = \mathcal{Y}_i \cap \mathcal{Y}_j = \emptyset$ for $i \ne j$,

(ii) $p(\mathcal{X}_i|\mathcal{Y}_j) = p(\mathcal{Y}_i|\mathcal{X}_j) = \delta_{ij}$, and

(iii) if $(x,y) \in \mathcal{X}_i \times \mathcal{Y}_i$ for some i, then $p_X(x)\,p_Y(y) > 0$.

For a given common partitioning, we refer to the subsets $\mathcal{X}_i \times \mathcal{Y}_i$ as the
"blocks" of the partitioning. The subscript i merely serves to label the different blocks, and for
any fixed labeling, we associate a random variable C(X, Y) such that C(x, y) = i if
$(x,y) \in \mathcal{X}_i \times \mathcal{Y}_i$. Note that each party can determine the value of C from
their local information, and it is therefore called a common function of X and Y. A maximal common
partitioning is a common partitioning of greatest length. The following proposition is proven in
the appendix.

Proposition 1.

(a) Every pair of finite random variables XY has a unique maximal common partitioning, which we
denote by $J_{XY}$.

(b) Variable $J_{XY}$ satisfies

$$H(J_{XY}) = \max_{K}\{H(K) : 0 = H(K|X) = H(K|Y)\}$$

iff $J_{XY}$ is a common function for the maximal common partitioning of XY.

(c) If f(X) = g(Y) = C is any other common function of X and Y, then $C = C(J_{XY})$; that is, C is a
function of $J_{XY}$.

With property (a), we can speak unambiguously of the maximal common partitioning of a
distribution $p_{XY}$. Consequently the variable $J_{XY}$ is unique up to a relabeling of its range. The
following proposition provides a useful characterization of values x and x' that belong to the
same block in a maximal common partitioning.

Proposition 2. If $J_{XY}(x) = J_{XY}(x')$ for $x, x' \in \mathcal{X}$, then there exists a sequence of
values

$$x\,y_1\,x_1\,y_2\,x_2 \cdots y_n\,x'$$

such that $p(x,y_1)\,p(y_1,x_1)\,p(x_1,y_2) \cdots p(y_n,x') > 0$.

Proof. See the appendix as well as [GK73]. □
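
Proposition 2 suggests a direct way to compute the maximal common partitioning: two symbols share a block exactly when a chain of positive-probability pairs connects them, so the blocks are the connected components of the bipartite support graph of $p_{XY}$. The following is a minimal sketch of that idea; the function name `maximal_common_partition` and the example distribution are assumptions made for illustration.

```python
from collections import defaultdict


def maximal_common_partition(p_xy):
    """Return the blocks (X_i, Y_i) of the maximal common partitioning of p_xy.

    p_xy maps (x, y) -> probability. Blocks are found as connected components
    of the bipartite graph with an edge x - y whenever p(x, y) > 0.
    """
    neighbours = defaultdict(set)
    for (x, y), prob in p_xy.items():
        if prob > 0:
            neighbours["x", x].add(("y", y))
            neighbours["y", y].add(("x", x))

    seen, blocks = set(), []
    for start in list(neighbours):
        if start in seen:
            continue
        seen.add(start)
        stack, xs, ys = [start], set(), set()
        while stack:                      # depth-first search over one component
            node = stack.pop()
            side, value = node
            (xs if side == "x" else ys).add(value)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        blocks.append((frozenset(xs), frozenset(ys)))
    return blocks


# Assumed example: symbols 0 and 1 are linked through y = 1, symbol 2 stands alone.
p_xy = {(0, 0): 0.25, (0, 1): 0.25, (1, 1): 0.25, (2, 2): 0.25}
blocks = maximal_common_partition(p_xy)
```

The value of $J_{XY}$ on an outcome is then simply the index of the block containing it, which either party can determine from their own symbol.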

4 Results 

4.1 Key Distillation Using Auxiliary Public Randomness 

The Gács-Körner Common Information plays a central role in the problem of key distillation
with no communication. To see a preliminary connection, we recall an operational interpretation
of $H(J_{XY})$ that Gács and Körner prove in Ref. [GK73]. The task involves Alice and Bob
constructing faithful encodings of their respective sources X and Y, and $H(J_{XY})$ quantifies the
asymptotic average sequence-length of codewords per copy such that both Alice and Bob's encodings
output matching codewords with high probability over this sequence [GK73].

For the task of key distillation, Alice and Bob are likewise trying to convert their sources into
matching sequences of optimal length. However, the key distillation problem is different in two
ways. On the one hand there is the additional constraint that the common sequence should be
nearly uncorrelated from Eve. On the other hand, unlike the Gács-Körner problem, it is not
required that these sequences belong to faithful encodings of the sources X and Y. Nevertheless,
we find that $H(J_{XY}|Z)$ quantifies the distillable key when Alice and Bob are unable to
communicate with one another. This is also the rate even if Alice and Bob have access to auxiliary
public randomness which is uncorrelated with their primary distribution.

Theorem 1. $K_{c.r.}(X:Y\|Z) = H(J_{XY}|Z)$. Moreover, $H(J_{XY}|Z)$ is achievable with no additional
common randomness.

Proof. Achievability: We will prove that $H(J_{XY}|Z)$ is an achievable rate without any auxiliary
shared public randomness (i.e. W is constant). For n copies of $p_{XYZ}$, Alice and Bob extract their
common information from each copy of $p_{XYZ}$. This will generate a sequence $J^n_{XY}$, with Alice
and Bob having identical copies of this sequence. It is now a matter of performing privacy
amplification on this sequence to remove Eve's information [BBCM95]. The main construction is
guaranteed to exist by the following lemma.

Lemma 2 (See Corollary 17.5 in [CK11]). For an i.i.d. source of two random variables $J_{XY}$ and Z
with $J_{XY}$ ranging over set $\mathcal{J}$, for any $\delta > 0$ and $k < 2^{n[H(J_{XY}|Z) - \delta]}$,
there exists an $\epsilon > 0$ and a mapping $\kappa : \mathcal{J}^n \to \mathcal{K} = \{1, 2, \dots, k\}$
such that

$$\log|\mathcal{K}| - H(\kappa(J^n_{XY})|Z^n) < 2^{-n\epsilon}.$$

From this lemma, it follows that $H(J_{XY}|Z)$ is an achievable key rate.

Converse: The converse proof follows analogously to the converse proof of Theorem 2.6 in Ref.
[CN00] (see also [CK11]). We will first prove the converse under the assumption of no local
randomness (i.e. $Q_A$ and $Q_B$ are constant). We will then show that adding local randomness
does not change the result. Suppose that $K_{c.r.}(X:Y\|Z) = R$. We consider a slightly weaker
security condition than the one presented in Sect. 2. This is done by replacing (ii) with (ii'):
$\frac{1}{n}(\log|\mathcal{K}| - H(K|Z^nW)) < \epsilon$. Under the weaker condition, (i) implies that

$$\begin{aligned}
\tfrac{1}{n}\left|H(f|Z^nW) - H(K|Z^nW)\right| &\le \tfrac{1}{n}\max\{H(f|KZ^nW), H(K|fZ^nW)\} \\
&\le \tfrac{1}{n}\max\{H(f|K), H(K|f)\} \\
&< \tfrac{1}{n}\big(h(\epsilon) + \epsilon(\log|\mathcal{K}| - 1)\big),
\end{aligned} \tag{6}$$

where the last line follows from Fano's Inequality. Hence, under the assumption of the original
security condition, $\frac{1}{n}(\log|\mathcal{K}| - H(f|Z^nW)) < \epsilon + \frac{1}{n}(h(\epsilon) + \epsilon(\log|\mathcal{K}| - 1))$.
This means that, without loss of generality, K can be assumed to be a function of $(X^n, Q_A, W)$;
i.e. $K = f(X^n, Q_A, W)$. Then, for every $\delta, \epsilon > 0$ and n sufficiently large, there exists
a random variable W independent of $X^nY^nZ^n$ along with functions $f(X^n, W)$ and $g(Y^n, W)$
satisfying (i) $\Pr[f = g = K] > 1 - \epsilon$, (ii') $\frac{1}{n}(\log|\mathcal{K}| - H(K|Z^nW)) < \epsilon$,
and (iii) $\frac{1}{n}\log|\mathcal{K}| \ge R$.

Note that from (i) in the security condition, Fano's Inequality together with data processing
gives

$$H(K|Y^nW) < h(\epsilon) + \epsilon(\log|\mathcal{K}| - 1). \tag{7}$$

Combining this with (ii') gives

$$\tfrac{1}{n}(1 - \epsilon)\log|\mathcal{K}| < \tfrac{1}{n}\left[H(K|Z^nW) - H(K|Y^nW) + h(\epsilon) - \epsilon\right],$$

and so

$$R \le \frac{1}{n}\log|\mathcal{K}| + \delta < \frac{1}{1-\epsilon}\cdot\frac{1}{n}\left[H(K|Z^nW) - H(K|Y^nW)\right] + \frac{1}{1-\epsilon}\cdot\frac{h(\epsilon) - \epsilon}{n} + \delta. \tag{8}$$

To analyze the quantity $H(K|Z^nW) - H(K|Y^nW)$, we will use a standard trick.

Lemma 3. Let J be uniformly distributed over the set $\{1, \dots, n\}$ and let $A^{(i)}$ denote the
i-th instance of A in $A^n$. Likewise, let $A^{(<i)} = A^{(1)} \cdots A^{(i-1)}$ and
$A^{(>i)} = A^{(i+1)} \cdots A^{(n)}$ with $A^{(<1)} := \emptyset$ and $A^{(>n)} := \emptyset$. Then for
random variables P and Q and sequences of random variables $A^n$, $B^n$,

$$H(P|A^nQ) - H(P|B^nQ) = n\left[I(P : B^{(J)}|TQ) - I(P : A^{(J)}|TQ)\right], \tag{9}$$

where $T = J A^{(>J)} B^{(<J)}$.

Proof. See, e.g., proof of Lemma 17.12 in [CK11]. □
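
For orientation, Eq. (9) can be seen from a standard telescoping argument, sketched here as a reading aid rather than as the proof given in [CK11]. The shorthand $T_i := B^{(<i)}A^{(>i)}$ is introduced only for this sketch, and J is assumed to be independent of all other variables.

```latex
\begin{align*}
H(P|A^nQ) - H(P|B^nQ)
 &= \sum_{i=1}^{n}\Big[ H\big(P \,\big|\, B^{(<i)}A^{(i)}A^{(>i)}Q\big)
      - H\big(P \,\big|\, B^{(<i)}B^{(i)}A^{(>i)}Q\big)\Big] \\
 &= \sum_{i=1}^{n}\Big[ I\big(P : B^{(i)} \,\big|\, T_iQ\big)
      - I\big(P : A^{(i)} \,\big|\, T_iQ\big)\Big] \\
 &= n\Big[ I\big(P : B^{(J)} \,\big|\, TQ\big) - I\big(P : A^{(J)} \,\big|\, TQ\big)\Big],
 \qquad T = J\,A^{(>J)}B^{(<J)} .
\end{align*}
```

The first line telescopes because the second term for index i coincides with the first term for index i + 1; the second line adds and subtracts $H(P|T_iQ)$; the third rewrites the sum as n times an average over the uniform index J.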

Then we can use Lemma 3 to obtain

$$H(K|Z^nW) - H(K|Y^nW) = n\left[I(K : Y^{(J)}|UW) - I(K : Z^{(J)}|UW)\right], \tag{10}$$

where $U := J Y^{(<J)} Z^{(>J)}$. Notice that for any $i \in \{1, \dots, n\}$ we have

$$X^{(<i)}X^{(>i)}Y^{(<i)}Z^{(>i)} - X^{(i)} - Y^{(i)}Z^{(i)}, \tag{11}$$

since the sampling is i.i.d. Therefore, because K is a function of $(X^n, W)$, we have

$$KU - X^{(J)}W - Y^{(J)}Z^{(J)}. \tag{12}$$

Removing the superscript "J" and taking $\epsilon, \delta \to 0$, we have the bound

$$R \le I(K:Y|UW) - I(K:Z|UW) \tag{13}$$

such that $KU - XW - YZ$.

Next, Eq. (7) gives

$$\begin{aligned}
h(\epsilon) + \epsilon(\log|\mathcal{K}| - 1) &> H(K|Y^nW) - H(K|X^nW) \\
&= n\left[I(K : X^{(J)}|JY^{(<J)}X^{(>J)}W) - I(K : Y^{(J)}|JY^{(<J)}X^{(>J)}W)\right],
\end{aligned} \tag{14}$$

where the first inequality follows because $H(K|X^nW)$ is nonnegative and the equality follows from
Lemma 3. We want to put this in terms of U. To do this, note that

$$\begin{aligned}
I(K : X^{(J)}|JY^{(<J)}X^{(>J)}W) &= I(KY^{(<J)}X^{(>J)} : X^{(J)}|JW) \\
&= I(KY^{(<J)}X^{(>J)}Z^{(>J)} : X^{(J)}|JW) - I(Z^{(>J)} : X^{(J)}|JKY^{(<J)}X^{(>J)}W) \\
&= I(KUX^{(>J)} : X^{(J)}|JW) \\
&= I(KU : X^{(J)}|JW) + I(X^{(>J)} : X^{(J)}|KUW),
\end{aligned} \tag{15}$$

where the first equality follows from the chain rule and $I(Y^{(<J)}X^{(>J)} : X^{(J)}|JW) = 0$, and in
the second equality

$$\begin{aligned}
I(Z^{(>J)} : X^{(J)}|JKY^{(<J)}X^{(>J)}W) &\le I(Z^{(>J)} : KX^{(J)}|JY^{(<J)}X^{(>J)}W) \\
&= I(Z^{(>J)} : X^{(J)}|JY^{(<J)}X^{(>J)}W) \\
&= 0.
\end{aligned} \tag{16}$$

The first equality in (16) uses $I(Z^{(>J)} : K|JY^{(<J)}X^{(>J)}W) = 0$ since
$K - JY^{(<J)}X^{(>J)}W - Z^{(>J)}$ is a Markov chain. Again this follows from the basic Markov
condition $K - WX^n - Y^nZ^n$ and the fact that the sampling is i.i.d. The second equality follows
from i.i.d. sampling and the independence of W from $X^n$, $Y^n$, $Z^n$. A similar analysis likewise
gives

$$\begin{aligned}
I(K : Y^{(J)}|JY^{(<J)}X^{(>J)}W) &= I(KU : Y^{(J)}|JW) + I(X^{(>J)} : Y^{(J)}|KUW) \\
&\le I(KU : Y^{(J)}|JW) + I(X^{(>J)} : X^{(J)}|KUW),
\end{aligned} \tag{17}$$

where the inequality follows from the Markov condition

$$X^{(>J)} - KUX^{(J)}W - Y^{(J)},$$

which can be derived from the more obvious Markov condition

$$KUX^n - JX^{(J)}W - Y^{(J)}.$$


Putting everything together yields

$$\begin{aligned}
h(\epsilon) + \epsilon(\log|\mathcal{K}| - 1) &> H(K|Y^nW) - H(K|X^nW) \\
&\ge I(KU : X^{(J)}|JW) - I(KU : Y^{(J)}|JW) \\
&= I(KU : X^{(J)}Y^{(J)}|JW) - I(KU : Y^{(J)}|JX^{(J)}W) - I(KU : Y^{(J)}|JW) && (18) \\
&= I(KU : X^{(J)}|JY^{(J)}W) + I(KU : Z^{(J)}|JY^{(J)}X^{(J)}W) && (19) \\
&= I(KU : X^{(J)}Z^{(J)}|JY^{(J)}W),
\end{aligned}$$

where the second term in (18) is zero from the already proven Markov chain $KU - XW - YZ$, and
in (19) we use the fact that $I(KU : Z^{(J)}|JY^{(J)}X^{(J)}W) = 0$. Removing the superscript "J" and
taking $\epsilon \to 0$ necessitates the Markov chain $KU - YW - XZ$.

The double Markov chain $K - XW - Y$ and $K - YW - X$ implies that $I(K : XY|J_{XY}W) = 0$ (see
Proposition 4 below). Since K is a function of (X, W), we have that $H(K|J_{XY}W) = 0$. Thus, K must
also be a function of (Y, W). Continuing Eq. (13) gives the bound

$$\begin{aligned}
R &\le I(K:Y|UW) - I(K:Z|UW) \\
&= H(K|UW) - I(K:Z|UW) \\
&= H(K|ZUW) \le H(K|ZW).
\end{aligned} \tag{20}$$

We have therefore obtained the following:

$$R \le \max H(K|ZW), \tag{21}$$

where the maximization is taken over all variables K such that $H(K|XW) = H(K|YW) = 0$.

This can be further bounded by using the following proposition.

Proposition 3. If W is independent of XY and $H(K|XW) = H(K|YW) = 0$, then K is a function of
$(J_{XY}, W)$.

Proof. The fact that $H(K|XW) = H(K|YW) = 0$ implies the existence of two functions f(X, W)
and g(Y, W) such that $\Pr[f(X,W) = g(Y,W)] = 1$. Consequently, if $p(x_1,y_1)\,p(x_1,y_2) > 0$, then
$f(x_1,w) = g(y_1,w) = g(y_2,w)$ for all $w \in \mathcal{W}$ with p(w) > 0. Indeed, if, say,
$f(x_1,w) \ne g(y_1,w)$, then $\Pr[f(X,W) \ne g(Y,W)] \ge p(x_1,y_1,w) = p(x_1,y_1)\,p(w) > 0$, where
we have used the independence between XY and W. By the same reasoning, $p(x_1,y_1)\,p(y_1,x_2) > 0$
implies that $f(x_1,w) = f(x_2,w) = g(y_1,w)$ for all $w \in \mathcal{W}$. Turning to Proposition 2, if
$J_{XY}(x) = J_{XY}(x')$, then there exists a sequence $x\,y_1\,x_1\,y_2\,x_2 \cdots y_n\,x'$ such that
$p(x,y_1)\,p(y_1,x_1)\,p(x_1,y_2) \cdots p(y_n,x') > 0$. Therefore, as just argued, we must have that
$f(x,w) = f(x',w)$ for all $w \in \mathcal{W}$. Hence K must be a function of $(J_{XY}, W)$. □

We now apply Proposition 3 to Eq. (21). Suppose that K attains the maximization in Eq. (21).
Then, since K is a function of $(J_{XY}, W)$, we have that

$$H(K|ZW) \le H(J_{XY}W|ZW) = H(J_{XY}|ZW) \le H(J_{XY}|Z). \tag{22}$$

This proves the desired upper bound under no local randomness.

To consider the case when Alice and Bob have local randomness $Q_A$ and $Q_B$, respectively,
define $\hat{X} := (X, Q_A)$ and $\hat{Y} := (Y, Q_B)$. Then repeating the above argument shows that
$R \le H(J_{\hat{X}\hat{Y}}|Z)$. It is straightforward to show that with $Q_A$ and $Q_B$ pairwise
independent and independent of XY, we have $J_{\hat{X}\hat{Y}} = J_{XY}$.

We complete the proof by giving the Double Markov Chain Proposition used to obtain equation
(20) above.

Proposition 4 (Conditional Double Markov Chains (also Exercise 16.25 in [CK11])). Random
variables WXYZ satisfy the two Markov chains $X - YZ - W$ and $Y - XZ - W$ iff
$I(XY : W|J_{XY|Z}Z) = 0$.

Proof. If $I(XY : W|J_{XY|Z}Z) = 0$ then $I(Y : W|J_{XY|Z}Z) = 0$. The Markov chain $X - YZ - W$
follows since

$$\begin{aligned}
I(XY : W|J_{XY|Z}Z) &= I(X : W|YJ_{XY|Z}Z) + I(Y : W|J_{XY|Z}Z) \\
&= I(X : W|YZ) + I(Y : W|J_{XY|Z}Z),
\end{aligned}$$

where we have used the fact that $J_{XY|Z}$ is a function of both X and Y when given Z. A similar
argument shows that $Y - XZ - W$.

On the other hand, if the two Markov chains hold, then whenever $p_{XYZ}(x,y,z) > 0$, we have

$$p(W = w|x,y,z) = p(w|x,z) = p(w|y,z). \tag{23}$$

Hence, the conditional distribution p(w|x,y,z) is constant across each block
$\mathcal{X}_i \times \mathcal{Y}_i$ in the maximal common partitioning of $p_{XY|Z=z}$. Consequently,

$$p_{W|XYZ} = p_{W|J_{XY|Z}Z},$$

and so for any $J_{XY|Z} = j$ and Z = z for which p(j,z) > 0, we have

$$\begin{aligned}
p(x,y,w|j,z) &= p(w|x,y,j,z)\,p(x,y|j,z) \\
&= p(w|x,y,z)\,p(x,y|j,z) = p(w|j,z)\,p(x,y|j,z).
\end{aligned} \tag{24}$$

Thus, $I(XY : W|J_{XY|Z}Z) = 0$. □

□

Figure 1: Examples of a distribution that is not uniform block (a) and one that is (b). Each entry
corresponds to a conditional probability value p(x,y|z). UB distribution (b) is not uniform block
independent (UBI) since the block in the Z = 1 plane contains correlations between Alice and Bob.


In Ref. [CFH14] we have studied a related quantity known as the maximal conditional common
function $J_{XY|Z}$, which is the collection of variables $\{J_{XY|Z=z} : z \in \mathcal{Z}\}$ with
$J_{XY|Z=z}$ being a maximal common function of the conditional distribution $p_{XY|Z=z}$. The variable
$J_{XY|Z}$ is again unique for every distribution $p_{XYZ}$ up to relabeling. Since $J_{XY|Z=z}$ is
computed from both X and Y with the additional information that Z = z, maximality of $J_{XY|Z=z}$
ensures that $J_{XY}$ is a function of $J_{XY|Z=z}$ for each $z \in \mathcal{Z}$. In other words, a
labeling of $J_{XY}$ and $J_{XY|Z}$ can be chosen so that $J_{XY}$ is a coarse-graining of $J_{XY|Z}$.
Therefore, $H(J_{XY}|Z) \le H(J_{XY|Z}|Z)$ with equality iff $H(J_{XY|Z}|ZJ_{XY}) = 0$. When the equality
condition holds, it means that for each $z \in \mathcal{Z}$, the value of $J_{XY|Z=z}$ can be determined
from $J_{XY}$ alone. Hence, the variables $J_{XY}$ and $J_{XY|Z}$ must be equivalent up to relabeling.
From this it follows that a distribution satisfies $H(J_{XY|Z}|ZJ_{XY}) = 0$ iff it admits a
decomposition of

$$p(x,y,z) = \sum_{J_{XY}=j} p(x,y|z,j)\,p(j|z)\,p(z), \tag{25}$$

where for any $x, x' \in \mathcal{X}$, $y, y' \in \mathcal{Y}$ and $z, z' \in \mathcal{Z}$ the conditional
distributions satisfy

$$p(x,y|z,j)\,p(x,y'|z',j') = 0, \qquad p(x,y|z,j)\,p(x',y|z',j') = 0 \qquad \text{if } j \ne j'.$$

The class of distributions of this form we shall call uniform block (UB) (see Fig. 1).

The quantity $H(J_{XY|Z}|Z)$ is the private key rate when Eve is helping by announcing her
variable, yet Alice and Bob are still prohibited from communicating with one another. Thus, the
difference $H(J_{XY|Z}|Z) - H(J_{XY}|Z)$ quantifies how much Eve can assist Alice and Bob in
distilling key when no communication is exchanged between the two. From the previous paragraph,
it follows that Eve offers no assistance (i.e. the private key rate equals the secret key rate) in the
no-communication scenario iff the distribution is UB.
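
The gap between the two no-communication rates can be computed numerically for small examples. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, blocks are found as connected components of the support graph (via a small union-find), and the example is a mixture in which Z = 0 flags a perfectly correlated bit and Z = 1 flags two independent bits.

```python
from collections import defaultdict
from math import log2


def block_labels(pairs):
    """Map each x to a block label: connected components of the bipartite graph
    whose edges are the given positive-probability (x, y) pairs."""
    parent = {}

    def find(node):
        while parent.setdefault(node, node) != node:
            parent[node] = parent[parent[node]]   # path halving
            node = parent[node]
        return node

    for x, y in pairs:
        parent[find(("x", x))] = find(("y", y))
    return {x: find(("x", x)) for x, _ in pairs}


def entropy(dist):
    """Shannon entropy in bits of a pmf given as {outcome: prob}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def no_communication_rates(p_xyz):
    """Return (H(J_XY | Z), H(J_{XY|Z} | Z)) in bits for a pmf {(x, y, z): prob}."""
    support = [(x, y, z) for (x, y, z), prob in p_xyz.items() if prob > 0]
    # J_XY: blocks of the marginal p_XY, so z is ignored when forming the graph.
    label = block_labels([(x, y) for x, y, _ in support])
    # J_{XY|Z}: blocks of each conditional p_{XY|Z=z}, computed separately per z.
    cond_label = {z0: block_labels([(x, y) for x, y, z in support if z == z0])
                  for z0 in {z for _, _, z in support}}
    p_z, p_jz, p_cz = defaultdict(float), defaultdict(float), defaultdict(float)
    for x, y, z in support:
        prob = p_xyz[x, y, z]
        p_z[z] += prob
        p_jz[label[x], z] += prob
        p_cz[cond_label[z][x], z] += prob
    h_z = entropy(p_z)
    return entropy(p_jz) - h_z, entropy(p_cz) - h_z


# Assumed example: Z = 0 -> X = Y uniform bit; Z = 1 -> X, Y independent uniform bits.
p_mix = {(0, 0, 0): 0.25, (1, 1, 0): 0.25}
p_mix.update({(x, y, 1): 0.125 for x in (0, 1) for y in (0, 1)})
secret_rate, private_rate = no_communication_rates(p_mix)
```

In this example the marginal $p_{XY}$ has full support, so $J_{XY}$ is trivial and the secret rate vanishes, while a helping Eve unlocks half a bit per copy; the distribution is therefore not UB.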

Returning to Theorem 1, we can now answer the underlying question of this paper for
no-communication distillation. By using the chain rule of conditional mutual information and the
fact that $J_{XY}$ is a function of both X and Y, we readily compute

$$\begin{aligned}
I(X:Y|Z) = I(J_{XY}X : Y|Z) &= I(J_{XY} : Y|Z) + I(X:Y|ZJ_{XY}) \\
&= H(J_{XY}|Z) + I(X:Y|ZJ_{XY}).
\end{aligned} \tag{26}$$

The conditional mutual information is thus an achievable rate whenever $I(X:Y|ZJ_{XY}) = 0$.
Distributions satisfying this equality are uniform block with the extra condition that
$p(x,y|z,j) = p(x|z,j)\,p(y|z,j)$ in Eq. (25). We shall call distributions having this form uniform
block independent (UBI). Putting everything together, we find that

Corollary 1. A distribution $p_{XYZ}$ satisfies $K_{c.r.}(X:Y\|Z) = I(X:Y|Z)$ if and only if it is
uniform block independent.
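
The computation behind Eq. (26) can be written out step by step. The expansion below is a sketch that uses only the chain rule and the fact that $J_{XY}$ is a function of X and also of Y; the annotations are added here for readability.

```latex
\begin{align*}
I(X:Y|Z) &= I(J_{XY}X : Y|Z)
   && \text{since } J_{XY} \text{ is a function of } X \\
 &= I(J_{XY} : Y|Z) + I(X:Y|ZJ_{XY})
   && \text{chain rule} \\
 &= H(J_{XY}|Z) - H(J_{XY}|YZ) + I(X:Y|ZJ_{XY}) \\
 &= H(J_{XY}|Z) + I(X:Y|ZJ_{XY})
   && \text{since } J_{XY} \text{ is a function of } Y .
\end{align*}
```

Because the second term is nonnegative, this also recovers $H(J_{XY}|Z) \le I(X:Y|Z)$, with equality exactly when X and Y are conditionally independent given Z and $J_{XY}$.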

Remark. The no-communication results discussed above and proven in the appendix are already
implicit in the work of Csiszár and Narayan. In Ref. [CN00], they study various key distillation
scenarios with Eve functioning as a helper and limited communication between Alice and Bob.
Included in this is the no-communication scenario with and without helper. However, being very
general in nature, Csiszár and Narayan's results involve optimizations over auxiliary random
variables, and it is therefore still a non-trivial matter to discern Theorem 1 and Corollary 1 directly
from their work. Additionally, they do not consider the scenario of just shared public randomness.

4.2 Obtaining I(X : Y|Z) with One-Way Communication

In this section we want to identify the type of tripartite distributions from which secret key can be
distilled at the rate $I(X:Y|Z)$ using one-way communication. Since $K(X:Y\|Z) \le I(X:Y|Z)$, our
analysis deals with distributions for which one-way communication suffices to optimally distill
secret key. Manipulating Eq. (1) of Lemma 1 allows us to determine when
$\overrightarrow{K}(X:Y\|Z) = I(X:Y|Z)$. We have that

$$\begin{aligned}
I(K:Y|U) - I(K:Z|U) &= I(K:Y|ZU) - I(K:Z|YU) \\
&= I(KU:Y|Z) - I(U:Y|Z) - I(K:Z|YU) \\
&= I(X:Y|Z) - I(X:Y|KUZ) - I(U:Y|Z) - I(K:Z|YU).
\end{aligned} \tag{27}$$

From this and Lemma 1, we conclude the following.

Lemma 4. Distribution $p_{XYZ}$ has $\overrightarrow{K}(X:Y\|Z) = I(X:Y|Z)$ iff there exist variables
KUXYZ with K and U ranging over sets of size no greater than $|\mathcal{X}| + 1$ such that

(1) $KU - X - YZ$, (2) $X - KUZ - Y$,

(3) $U - Z - Y$, (4) $K - YU - Z$. (28)
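
The three steps of Eq. (27) each rest on a chain-rule identity, collected in the sketch below for reference. Only the last one uses the Markov chain $KU - X - YZ$; the layout is added here and is not part of the original derivation.

```latex
\begin{align*}
I(K:YZ|U) &= I(K:Z|U) + I(K:Y|ZU) = I(K:Y|U) + I(K:Z|YU), \\
I(KU:Y|Z) &= I(U:Y|Z) + I(K:Y|ZU), \\
I(XKU:Y|Z) &= I(KU:Y|Z) + I(X:Y|KUZ) = I(X:Y|Z) + I(KU:Y|XZ),
  \qquad I(KU:Y|XZ) = 0 .
\end{align*}
```

Since the three subtracted terms in Eq. (27) are nonnegative, the rate $I(X:Y|Z)$ is attained exactly when $I(X:Y|KUZ)$, $I(U:Y|Z)$, and $I(K:Z|YU)$ all vanish, which are conditions (2), (3), and (4) of Eq. (28).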

The conditions of Lemma 4 allow for the following rough interpretation. (1) says that Alice is able
to generate variables K and U from knowledge of her variable X. We think of K as containing the
key that Alice and Bob will share and U as the public message sent from Alice to Bob. (2) says
that from Eve's perspective, Alice and Bob share no more correlations given U and K. Likewise,
(3) says that from Eve's perspective, the public message is uncorrelated with Bob. Finally, (4) says
that after learning U, Bob can generate the key K that is independent from Eve.

Unfortunately, Lemma 4 does not provide a transparent characterization of the distributions
for which $\overrightarrow{K}(X:Y\|Z) = I(X:Y|Z)$. We next proceed to obtain a better picture of these
distributions by exploring additional consequences of the Markov chains in Eq. (28). The following
places a necessary condition on the distributions. We will see in Section 4.4, however, that it fails
to be sufficient.

Theorem 2. If distribution $p_{XYZ}$ has either $\overrightarrow{K}(X:Y\|Z) = I(X:Y|Z)$ or
$\overleftarrow{K}(X:Y\|Z) = I(X:Y|Z)$, then $p_{XYZ}$ must have the following property: For any
$z \in \mathcal{Z}$, if $\mathcal{X}_i \times \mathcal{Y}_i$ and $\mathcal{X}_j \times \mathcal{Y}_j$ are two
distinct blocks in the maximal common partitioning of $p_{XY|Z=z}$, then

$$p_{XY}(\mathcal{X}_i, \mathcal{Y}_j) = 0.$$

Proof. Without loss of generality, assume that $\overrightarrow{K}(X:Y\|Z) = I(X:Y|Z)$. For distribution
$p_{XY|Z=z}$ with maximal common partition $(\mathcal{X}_\lambda \times \mathcal{Y}_\lambda)_{\lambda=1}^{t}$, consider arbitrary
$(x_i, y_i) \in \mathcal{X}_i \times \mathcal{Y}_i$ and $(x_j, y_j) \in \mathcal{X}_j \times \mathcal{Y}_j$. Note that from the definition of a
maximal common partitioning, we have that $p(x_i, z)p(y_i, z) > 0$, but we need not have that
$p(x_i, y_i, z) > 0$.

We will prove that $p(x_i, y_j, z') = 0$ for all $z' \in \mathcal{Z}$ (clearly this already holds when $z' = z$).
Suppose on the contrary that $p(x_i, y_j, z') > 0$. Since $p(x_i, z) > 0$, there will exist some $y_i' \in \mathcal{Y}_i$
such that $p(x_i, y_i', z) > 0$. Then the Markov chain condition $KU - X - YZ$ implies that for some
$(k, u) \in \mathcal{K} \times \mathcal{U}$ such that $p(k,u|x_i) > 0$, we have

$$p(k,u|x_i) = p(k,u|x_i,y_i',z) = p(k,u|x_i,y_j,z') > 0. \tag{29}$$

Eq. (29) implies that both $p(k,u|y_i',z) > 0$ and $p(k,u|y_j,z') > 0$. From $p(u|y_i',z) > 0$ and the
Markov chain $U - Z - Y$, we have that $p(u|y_j,z) > 0$. Then we can further derive

$$\begin{aligned}
0 < p(k,u|y_j,z') &= p(u|y_j,z')\,p(k|u,y_j,z') \\
&= p(u|y_j,z')\,p(k|u,y_j,z) \\
\Rightarrow\quad p(k|u,y_j,z) &> 0, \\
p(k,u|y_j,z) &= p(k|u,y_j,z)\,p(u|y_j,z) > 0,
\end{aligned} \tag{30}$$

where we have used the Markov chain $K - YU - Z$. From the last line, we must be able to find
some $x_j' \in \mathcal{X}_j$ such that $p(x_j', y_j, z) > 0$ and $p(k,u|x_j',y_j,z) > 0$. Inverting probabilities gives that
both $p(x_j', y_j|k,u,z) > 0$ and $p(x_i, y_i'|k,u,z) > 0$. Hence,

$$\begin{aligned}
I(X:Y|KUZ) &= I(J_{XY|Z}X : Y|KUZ) \\
&= I(X:Y|J_{XY|Z}KUZ) + \sum_{k,u,z} H(J_{XY|Z=z}|k,u,z)\,p(k,u,z) > 0,
\end{aligned} \tag{31}$$

since $H(J_{XY|Z=z}|k,u,z) > 0$ because $(x_i, y_i') \in \mathcal{X}_i \times \mathcal{Y}_i$ and $(x_j', y_j) \in \mathcal{X}_j \times \mathcal{Y}_j$. However, this strict
inequality contradicts the Markov chain condition $X - KUZ - Y$. □

Figure 2 (a) provides an example distribution which does not satisfy the necessary conditions
of Theorem 2 for I(X : Y|Z) to be an achievable one-way key rate. On the other hand, Figure 2 (b)
depicts a distribution for which the conditions of the theorem are met. However, Theorem 3 in
the next section will show that both distributions (a) and (b) have $K(X:Y\|Z) < I(X:Y|Z)$.
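The necessary condition of Theorem 2 is a finite check on the support of the distribution. The sketch below is one possible way to carry it out, under an explicit assumption: the blocks of the maximal common partitioning of $p_{XY|Z=z}$ are taken to be the connected components of the bipartite graph whose edges are the pairs (x, y) with p(x, y|z) > 0. The function names and the dictionary format `{(x, y, z): prob}` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def common_blocks(pairs):
    """Blocks (X_i, Y_i) of the bipartite support graph given by `pairs`.

    Assumption: the maximal common partitioning coincides with the
    connected components of the graph with an edge for each (x, y) in `pairs`.
    """
    adj = defaultdict(set)
    for x, y in pairs:
        adj[("x", x)].add(("y", y))
        adj[("y", y)].add(("x", x))
    seen, blocks = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first search over one component
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        xs = {v for side, v in comp if side == "x"}
        ys = {v for side, v in comp if side == "y"}
        blocks.append((xs, ys))
    return blocks

def satisfies_theorem2(pmf):
    """True if p_XY(X_i, Y_j) = 0 for all distinct blocks of every p_{XY|Z=z}."""
    support_xy = {(x, y) for (x, y, z), p in pmf.items() if p > 0}
    for z0 in {z for (_, _, z), p in pmf.items() if p > 0}:
        blocks = common_blocks(
            [(x, y) for (x, y, z), p in pmf.items() if z == z0 and p > 0])
        for i, (xs_i, _) in enumerate(blocks):
            for j, (_, ys_j) in enumerate(blocks):
                if i != j and any((x, y) in support_xy
                                  for x in xs_i for y in ys_j):
                    return False
    return True

# Assumed toy distributions: for Z = 1 the blocks are {1}x{1} and {2}x{2}.
violating = {(1, 1, 1): 0.25, (2, 2, 1): 0.25, (1, 2, 0): 0.5}
compliant = {(1, 1, 1): 0.25, (2, 2, 1): 0.25, (1, 1, 0): 0.5}
print(satisfies_theorem2(violating), satisfies_theorem2(compliant))
```

Passing this check does not establish that I(X : Y|Z) is achievable, since the condition is only necessary.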

4.3 Obtaining I(X : Y|Z) with Two-Way Communication 

We now turn to the general scenario of interactive two-way communication. Our main result is 
the necessary structural condition of Theorem 3. Its statement requires some new terminology. 

For two distributions $p_{XY}$ and $q_{XY}$ over $\mathcal{X} \times \mathcal{Y}$, we say that $q_{XY} \prec p_{XY}$ if, up to a permutation
between X and Y, the distributions satisfy $\mathrm{supp}[q_X] \subset \mathrm{supp}[p_X]$ and one of the three additional
conditions: (i) $q_{XY}$ is uncorrelated, (ii) $\mathrm{supp}[q_Y] \subset \mathrm{supp}[p_Y]$, or (iii) $y \in \mathrm{supp}[q_Y] \setminus \mathrm{supp}[p_Y]$ implies
that $H(X|Y=y) = 0$.
Figure 2: (a) The conditions for a one-way key rate of I(X : Y|Z) given by Theorem 2 are violated for this
distribution. To see this, note that the events (X = 1, Y = 2) and (X = 2, Y = 1) are both possible when
Z = 1. Hence, Theorem 2 necessitates p(1, 1) = 0, which is not the case because of the plane Z = 0.
Distribution (b) lacks this characteristic and therefore it satisfies the conditions of Theorem 2.


Theorem 3. Let $p_{XYZ}$ be a distribution over $\mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$ such that $p_{XY|Z=z_1} \prec p_{XY|Z=z_0}$ for some $z_0, z_1 \in \mathcal{Z}$.
If there exists some pair $(x,y) \in \mathrm{supp}[p_{X|Z=z_0}] \times \mathrm{supp}[p_{Y|Z=z_0}]$ for which $p(x,y|z_1) > 0$ but $p(x,y|z_0) = 0$,
then $K(X:Y\|Z) < I(X:Y|Z)$.

Proof. The proof will involve showing that there exists a channel $\bar{Z}|Z$ such that $I(X:Y|\bar{Z}) < I(X:Y|Z)$;
this suffices because the key rate is upper bounded by the intrinsic information, and hence by $I(X:Y|\bar{Z})$
for every such channel. The channel will involve mixing $z_0$ and $z_1$ but leaving all other elements unchanged.
Define the function

$$f(t) = I(X:Y)_{(1-t)\,p_{XY|Z=z_0} + t\,p_{XY|Z=z_1}}, \tag{32}$$

which gives the mutual information of the mixed distribution $(1-t)\,p_{XY|Z=z_0} + t\,p_{XY|Z=z_1}$. The
function $f$ is continuous and twice differentiable in the open interval (0,1). To prove the theorem,
we will need a simple general fact about functions of this sort.

Proposition 5. Suppose that $f$ is a continuous function on the closed interval [0,1] and twice differentiable
in the open interval (0,1). Suppose there exists some $0 < \delta < 1$ such that $f$ is strictly convex in the interval
$I = (0,\delta]$ and $f(1) - f(0) > f'(t)$ for all $t \in I$. Then $f(t) < (1-t)f(0) + tf(1)$ for all $t \in I$.

Proof. Introduce the linear function $g(t) = (1-t)f(0) + tf(1)$. Note that by assumption we have
$g'(t) > f'(t)$ for $t \in I$. We want to show that $f(t) < g(t)$ for $t \in I$. We have

$$g(t) = \left(1 - \tfrac{t}{\delta}\right)g(0) + \tfrac{t}{\delta}\,g(\delta) > \left(1 - \tfrac{t}{\delta}\right)f(0) + \tfrac{t}{\delta}\,f(\delta) > f(t). \tag{33}$$

Here, the first inequality follows from the facts that $f(0) = g(0)$ and $g'(t) > f'(t)$ for $t \in I$ (so
$g(\delta) > f(\delta)$), and the second inequality uses the strict convexity of $f$ in $I$. □
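The conclusion of Proposition 5, applied to the mixing function of Eq. (32), can be observed numerically. The following sketch uses a pair of conditional distributions chosen purely for illustration (they are not taken from the paper's figures): `p0` is a perfectly correlated bit and `p1` is a uniform, uncorrelated pair of bits, so that `p1` has mass on pairs where `p0` has none. The helper names are hypothetical.

```python
import math

def mutual_information(pxy):
    """I(X:Y) in bits for a joint pmf {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), p in pxy.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in pxy.items() if p > 0)

def mix(p0, p1, t):
    """The mixed distribution (1 - t) * p0 + t * p1."""
    keys = set(p0) | set(p1)
    return {k: (1 - t) * p0.get(k, 0.0) + t * p1.get(k, 0.0) for k in keys}

def f(t, p0, p1):
    """Mutual information of the mixture, as in Eq. (32)."""
    return mutual_information(mix(p0, p1, t))

# Assumed example distributions (illustrative only).
p0 = {(0, 0): 0.5, (1, 1): 0.5}
p1 = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

for t in (0.01, 0.1, 0.3):
    chord = (1 - t) * f(0, p0, p1) + t * f(1, p0, p1)
    print(f"t={t}: f(t)={f(t, p0, p1):.4f}  chord={chord:.4f}")
```

For this particular pair the curve lies strictly below the chord at each sampled t, which is the behaviour the proposition guarantees near t = 0.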

Continuing with the proof of Theorem 3, it will suffice to show that the function given by 
Eq. (32) satisfies the conditions of Proposition 5. For if this is true, then we can argue as follows. 
Choose $\epsilon$ sufficiently small so that $\frac{\epsilon p(z_1)}{p(z_0)+\epsilon p(z_1)} \in (0,\delta]$, where $\delta$ is described by the proposition.

Define the channel $\bar{Z}|Z$ by $p(\bar{z}_0|z_1) = \epsilon$, $p(\bar{z}_1|z_1) = 1 - \epsilon$, and $p(\bar{z}|z) = 1$ for all $z \neq z_1 \in \mathcal{Z}$.
This means that $p(\bar{z}_0) = p(z_0) + \epsilon p(z_1)$ and $p(\bar{z}_1) = (1-\epsilon)p(z_1)$, and inverting the probabilities
gives $p(z_1|\bar{z}_1) = 1$, $p(z_1|\bar{z}_0) = \frac{\epsilon p(z_1)}{p(z_0)+\epsilon p(z_1)}$, and $p(z_0|\bar{z}_0) = \frac{p(z_0)}{p(z_0)+\epsilon p(z_1)}$. Since
$p(x,y|\bar{Z}=\bar{z}) = \sum_{z} p(x,y|Z=z)\,p(Z=z|\bar{Z}=\bar{z})$, the average conditional mutual information is

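$$\begin{aligned}
I(X:Y|\bar{Z}) &= \sum_{z \neq z_0, z_1 \in \mathcal{Z}} I(X:Y|Z=z)\,p(z) + f\!\left(\tfrac{\epsilon p(z_1)}{p(z_0)+\epsilon p(z_1)}\right)p(\bar{z}_0) + f(1)\,p(\bar{z}_1) \\
&< \sum_{z \neq z_0, z_1 \in \mathcal{Z}} I(X:Y|Z=z)\,p(z) + \left(\tfrac{p(z_0)}{p(z_0)+\epsilon p(z_1)}\,f(0) + \tfrac{\epsilon p(z_1)}{p(z_0)+\epsilon p(z_1)}\,f(1)\right)p(\bar{z}_0) + f(1)(1-\epsilon)\,p(z_1) \\
&= I(X:Y|Z),
\end{aligned} \tag{34}$$

where Proposition 5 at $t = \frac{\epsilon p(z_1)}{p(z_0)+\epsilon p(z_1)}$ has been invoked.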
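The effect of the mixing channel in Eq. (34) can be reproduced on a small example. The sketch below is illustrative only: the two-outcome Eve, the conditional distributions, and the function names are assumptions made here, and the output labels of the channel reuse the input labels $z_0, z_1$ for convenience.

```python
import math

def mutual_information(pxy):
    """I(X:Y) in bits for a joint pmf {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), p in pxy.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in pxy.items() if p > 0)

def cond_mi(p_z, p_xy_given_z):
    """I(X:Y|Z) = sum_z p(z) I(X:Y|Z=z)."""
    return sum(p_z[z] * mutual_information(p_xy_given_z[z]) for z in p_z)

def apply_mixing_channel(p_z, p_xy_given_z, z0, z1, eps):
    """Send z1 to z0 with probability eps; leave every other symbol unchanged."""
    new_pz = dict(p_z)
    new_pz[z0] = p_z[z0] + eps * p_z[z1]
    new_pz[z1] = (1 - eps) * p_z[z1]
    t = eps * p_z[z1] / new_pz[z0]  # posterior weight of z1 given output z0
    keys = set(p_xy_given_z[z0]) | set(p_xy_given_z[z1])
    new_cond = dict(p_xy_given_z)
    new_cond[z0] = {k: (1 - t) * p_xy_given_z[z0].get(k, 0.0)
                       + t * p_xy_given_z[z1].get(k, 0.0) for k in keys}
    return new_pz, new_cond

# Assumed example: z0 = 0 gives a perfectly correlated bit,
# z1 = 1 gives two independent uniform bits.
p_z = {0: 0.5, 1: 0.5}
p_xy_given_z = {
    0: {(0, 0): 0.5, (1, 1): 0.5},
    1: {(x, y): 0.25 for x in (0, 1) for y in (0, 1)},
}
before = cond_mi(p_z, p_xy_given_z)
after = cond_mi(*apply_mixing_channel(p_z, p_xy_given_z, z0=0, z1=1, eps=0.1))
print(before, after)
```

In this example processing Eve's variable lowers the conditional mutual information, so I(X : Y|Z) cannot be the key rate for it.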

Let us then show that the conditions of Proposition 5 hold true for the function given by Eq.
(32) whenever $p_{XY|Z=z_1} \prec p_{XY|Z=z_0}$; i.e. that there exists some interval $(0,\delta]$ for which $f$ is strictly
convex and $f(1) - f(0) > f'(t)$. We have


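$$\begin{aligned}
f(t) = &- \sum_{x \in \mathcal{X}} [(1-t)p(x|z_0) + tp(x|z_1)] \log[(1-t)p(x|z_0) + tp(x|z_1)] \\
&- \sum_{y \in \mathcal{Y}} [(1-t)p(y|z_0) + tp(y|z_1)] \log[(1-t)p(y|z_0) + tp(y|z_1)] \\
&+ \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} [(1-t)p(x,y|z_0) + tp(x,y|z_1)] \log[(1-t)p(x,y|z_0) + tp(x,y|z_1)].
\end{aligned} \tag{35}$$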


We are interested in $\lim_{t \to 0} f'(t)$ and $\lim_{t \to 0} f''(t)$. To compute these, we use the fact that the
function $g(t) = (r + st)\log(r + st)$ satisfies $g'(t) = s(1 + \log(r + st))$ and $g''(t) = \frac{s^2}{r + st}$. We
separate the analysis into three cases. Without loss of generality, we will assume
$\mathrm{supp}[p_{X|Z=z_1}] \subset \mathrm{supp}[p_{X|Z=z_0}]$.
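The divergence statements used in the cases that follow come from terms of Eq. (35) whose constant part vanishes. As a short worked step (a sketch assuming natural logarithms, with $r$ and $s$ denoting the constant and linear coefficients of a single term), take $r = p(x,y|z_0) = 0$ and $s = p(x,y|z_1) > 0$:

```latex
\begin{align*}
g(t)   &= st\,\log(st), \\
g'(t)  &= s\bigl(1 + \log(st)\bigr) \;\longrightarrow\; -\infty \quad (t \to 0^{+}), \\
g''(t) &= \frac{s^{2}}{st} = \frac{s}{t} \;\longrightarrow\; +\infty \quad (t \to 0^{+}).
\end{align*}
```

Terms with $r > 0$ have finite first and second derivatives at $t = 0$, so they cannot offset these divergences.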

Case (i): $p_{XY|Z=z_1}$ is uncorrelated.

Since $\mathrm{supp}[p_{X|Z=z_1}] \subset \mathrm{supp}[p_{X|Z=z_0}]$, we can assume that $p(x|z_0) \neq 0$ for all $x$; otherwise there
is no term involving $x$ in Eq. (35). Now suppose that $p(y|z_0) = 0$. Then for this fixed $y$, the
summation over $x$ in the third term of Eq. (35) becomes

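$$\begin{aligned}
\sum_{x \in \mathcal{X}} &[(1-t)p(x,y|z_0) + tp(x,y|z_1)] \log[(1-t)p(x,y|z_0) + tp(x,y|z_1)] \\
&= t \sum_{x \in \mathcal{X}} p(x|z_1)p(y|z_1) \log[t\,p(x|z_1)p(y|z_1)] \\
&= t\,p(y|z_1)\log[t\,p(y|z_1)] + t\,p(y|z_1) \sum_{x \in \mathcal{X}} p(x|z_1)\log[p(x|z_1)].
\end{aligned} \tag{36}$$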


Hence, by letting $B_l = \{y : p(y|z_l) > 0\}$ for $l \in \{0,1\}$, we can equivalently write Eq. (35) as

$$\begin{aligned}
f(t) = &- \sum_{x \in \mathcal{X}} [(1-t)p(x|z_0) + tp(x|z_1)] \log[(1-t)p(x|z_0) + tp(x|z_1)] \\
&- \sum_{y \in B_0} [(1-t)p(y|z_0) + tp(y|z_1)] \log[(1-t)p(y|z_0) + tp(y|z_1)] \\
&+ \sum_{y \in B_0} \sum_{x \in \mathcal{X}} [(1-t)p(x,y|z_0) + tp(x,y|z_1)] \log[(1-t)p(x,y|z_0) + tp(x,y|z_1)] \\
&+ t \sum_{y \in B_1 \setminus B_0} p(y|z_1) \sum_{x \in \mathcal{X}} p(x|z_1)\log[p(x|z_1)].
\end{aligned} \tag{37}$$
If $p(x,y|z_0) = 0$ for some $(x,y) \in \mathcal{X} \times B_0$, then the first derivative of (37) will diverge to $-\infty$ as
$t \to 0$ while its second derivative will diverge to $+\infty$ whenever $p(x,y|z_1) > 0$. But by assumption,
there is at least one pair $(x,y)$ for which this latter case holds. Hence, an interval $(0,\delta]$ can
always be found for which Proposition 5 can be applied to $f$.

Case (ii): $B_1 \setminus B_0 = \emptyset$.

This is covered in case (iii). 

Case (iii): $y \in B_1 \setminus B_0 \Rightarrow p(y|z_1) = p(x_y, y|z_1)$ for some particular $x_y \in \mathcal{X}$.

The condition $p(y|z_1) = p(x_y, y|z_1)$ implies that $p(x,y|z_1) = 0$ for all $x \neq x_y$. Then similar to
the previous case, when $y \in B_1 \setminus B_0$, the summation over $x$ in the third term of Eq. (35) is

$$\sum_{x \in \mathcal{X}} t\,p(x,y|z_1)\log[t\,p(x,y|z_1)] = t\,p(x_y,y|z_1)\log[t\,p(x_y,y|z_1)] = t\,p(y|z_1)\log[t\,p(y|z_1)]. \tag{38}$$

Hence each term with $y \in B_1 \setminus B_0$ is canceled in Eq. (35). Then Eq. (35) reduces to

$$\begin{aligned}
f(t) = &- \sum_{x \in \mathcal{X}} [(1-t)p(x|z_0) + tp(x|z_1)] \log[(1-t)p(x|z_0) + tp(x|z_1)] \\
&- \sum_{y \in B_0} [(1-t)p(y|z_0) + tp(y|z_1)] \log[(1-t)p(y|z_0) + tp(y|z_1)] \\
&+ \sum_{x \in \mathcal{X}} \sum_{y \in B_0} [(1-t)p(x,y|z_0) + tp(x,y|z_1)] \log[(1-t)p(x,y|z_0) + tp(x,y|z_1)].
\end{aligned} \tag{39}$$

As in the previous case, the first derivative of this function will diverge to $-\infty$ while its second
derivative will diverge to $+\infty$ whenever $p(x,y|z_1) > 0$ and $p(x,y|z_0) = 0$. By assumption, such
a pair $(x,y)$ exists, and so again, an interval $(0,\delta]$ can always be found for which Proposition 5
can be applied to $f$. Note that when $B_1 \setminus B_0 = \emptyset$, as in case (ii), Eq. (39) is equivalent to (35). The
derivative argument can thus be applied directly to (35). □

Theorem 3 is quite useful in that it allows us to quickly eliminate many distributions from
achieving the rate I(X : Y|Z). For example, consider when $p_{XY|Z=z}$ is uncorrelated for some $z \in \mathcal{Z}$,
but $p_{XY|Z=z'}$ is perfectly correlated for some other $z' \in \mathcal{Z}$ with either $\mathrm{supp}[p_{X|Z=z}] \subset \mathrm{supp}[p_{X|Z=z'}]$
or $\mathrm{supp}[p_{Y|Z=z}] \subset \mathrm{supp}[p_{Y|Z=z'}]$. Here, perfectly correlated means that $p(x,y|z') = p(x|z')\delta_{x,y}$ up
to relabeling. Then from Theorem 3, it follows that I(X : Y|Z) is an achievable rate only if

$$p(x,y|z) > 0 \;\Rightarrow\; p(x|z')p(y|z') = 0.$$

In other words, it is always possible for either Alice or Bob to identify when $Z \neq z'$.
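Because the hypothesis of Theorem 3 only involves supports and a correlation test, screening a distribution with it can be automated. The following sketch is one hedged reading of the definitions above, not the authors' procedure: inclusions are treated as non-strict, condition (ii) is handled as the vacuous instance of condition (iii), zero probabilities are compared exactly, and all function names are hypothetical.

```python
def supp(pmf, axis):
    """Support of the marginal on coordinate `axis` of a pmf {(x, y): prob}."""
    return {k[axis] for k, p in pmf.items() if p > 0}

def is_uncorrelated(q, tol=1e-12):
    qx, qy = {}, {}
    for (x, y), p in q.items():
        qx[x] = qx.get(x, 0.0) + p
        qy[y] = qy.get(y, 0.0) + p
    return all(abs(q.get((x, y), 0.0) - qx[x] * qy[y]) < tol
               for x in qx for y in qy)

def _precedes_one_way(q, p):
    if not supp(q, 0) <= supp(p, 0):
        return False
    if is_uncorrelated(q):                      # condition (i)
        return True
    extra = supp(q, 1) - supp(p, 1)             # empty set gives condition (ii)
    # condition (iii): each extra y determines x under q
    return all(len({x for (x, y), w in q.items() if y == y0 and w > 0}) == 1
               for y0 in extra)

def _swap(pmf):
    return {(y, x): w for (x, y), w in pmf.items()}

def precedes(q, p):
    """q < p in the sense defined before Theorem 3, up to swapping X and Y."""
    return _precedes_one_way(q, p) or _precedes_one_way(_swap(q), _swap(p))

def theorem3_applies(p_xy_given_z):
    """True if some ordered pair (z0, z1) meets the hypothesis of Theorem 3."""
    for z0, p0 in p_xy_given_z.items():
        for z1, p1 in p_xy_given_z.items():
            if z0 == z1 or not precedes(p1, p0):
                continue
            for x in supp(p0, 0):
                for y in supp(p0, 1):
                    if p1.get((x, y), 0.0) > 0 and p0.get((x, y), 0.0) == 0:
                        return True
    return False

# Assumed example: a perfectly correlated bit versus two independent bits.
correlated = {(0, 0): 0.5, (1, 1): 0.5}
uniform = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
print(theorem3_applies({0: correlated, 1: uniform}))
print(theorem3_applies({0: correlated, 1: correlated}))
```

A result of True means the key rate is strictly below I(X : Y|Z); a result of False is inconclusive.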

Finally, we close this section by comparing Theorems 2 and 3. In short, neither one supersedes
the other. As noted above, distribution (b) in Fig. 2 satisfies the necessary condition of Theorem 2
for $\overrightarrow{K}(X:Y\|Z) = I(X:Y|Z)$. However, Theorem 3 can be used to show that $K(X:Y\|Z) < I(X:Y|Z)$.
This is because $p_{XY|Z=1} \prec p_{XY|Z=2}$, yet $p(1,1|2) = 0$ while $p(1,1|1) = 1/3$. Therefore its key
rate is strictly less than I(X : Y|Z). Figure 3 depicts a distribution for which Theorem 3 cannot
be applied but Theorem 2 shows that $\overrightarrow{K}(X:Y\|Z) < I(X:Y|Z)$. The two-way key rate for this
distribution is still unknown.
Figure 3: The event (x,y) = (0,1) has conditional probabilities p(0,1|Z = 0) > 0 and p(0,1|Z = 1) = 0.
However, we cannot use these facts in conjunction with Theorem 3 to conclude that $K(X:Y\|Z) < I(X:Y|Z)$
since the distribution does not satisfy $p_{XY|Z=0} \prec p_{XY|Z=1}$ (neither $\mathrm{supp}[p_{X|Z=0}] \subset \mathrm{supp}[p_{X|Z=1}]$ nor
$\mathrm{supp}[p_{Y|Z=0}] \subset \mathrm{supp}[p_{Y|Z=1}]$). On the other hand, since p(0,1|Z = 0) > 0, Theorem 2 can be applied to
conclude that the one-way rate is less than I(X : Y|Z).


4.4 Communication Dependency in Optimal Distillation 

We next consider some general features of the public communication when performing optimal 
key distillation. Our main observations will be that (i) attaining a key rate of I(X : Y|Z) by 
one-way communication may depend on the direction of the communication, and (ii) two-way 
communication may be necessary in order to achieve the key rate I(X : Y|Z).

Example (Optimal one-way distillation depends on communication direction). Consider the
distribution depicted in Fig. 4 with I(X : Y|Z) = 1/3. When Bob is the communicating party, a
protocol attaining this as a key rate is obvious: he simply announces whether or not $y \in \{0,1\}$. If
it is, they share one bit, otherwise they fail. Hence, I(X : Y|Z) = 1/3 is an achievable key rate.

However, the interesting question is whether or not the key rate I(X : Y|Z) is achievable by
one-way communication from Alice to Bob. We will now show that this is not possible. By Lemma
4, in order to obtain the rate I(X : Y|Z), there must exist random variables K and U satisfying Eq.
(28). Assume that such variables exist. If $U - Z - Y$, then $p(u|X=0)\,p(u|X=1) > 0$ for all $U = u$;
otherwise, U and Y couldn't be independent. But then $X - KUZ - Y$ applied to Z = 0 means there
must exist a pair $(k,u) \in \mathcal{K} \times \mathcal{U}$ such that

$$p(k,u|X=0) = 0 \quad \& \quad p(k,u|X=1) > 0.$$

Hence, $0 = p(k|Y=2, U=u, Z=2) < p(k|Y=2, U=u, Z=1)$, which contradicts $K - YU - Z$.
Thus $\overrightarrow{K}(X:Y\|Z) < I(X:Y|Z) = \overleftarrow{K}(X:Y\|Z)$.

In this example, notice that if we restricted Eve's distribution to $\mathcal{Z} = \{0,1\}$ (i.e., p(Z = 2) = 0),
then the rate I(X : Y|Z) would indeed be achievable using one-way communication from Alice
to Bob. This is because without the z = 2 outcome, the Markov chain $X - Y - Z$ holds. Such a
result is counter-intuitive since Alice and Bob share no correlations when $z \in \{1,2\}$. And yet the
distribution becomes one-way reversible from Alice to Bob when p(Z = 2) = 0, but otherwise it
is not.
Figure 4: A distribution requiring communication from Bob to Alice to achieve a key rate of I(X : Y|Z).

Figure 5: Additional outcomes augmented to the distribution of Fig. 4. The enlarged distribution can no
longer attain a key rate of I(X : Y|Z) unless both parties communicate.


Example (Optimal distillation requires two-way communication). The previous example can be
generalized by adding two more outcomes for Eve so that $|\mathcal{Z}| = 5$. The additional outcomes are
shown in Fig. 5 and this is combined with Fig. 4 to give the full distribution. Notice that the
distribution $p_{XY|Z=3}$ is obtained from $p_{XY|Z=1}$ simply by swapping Alice and Bob's variables, and
likewise for $p_{XY|Z=4}$ and $p_{XY|Z=2}$. Hence by the argument of the previous example, if Eve were
to reveal whether or not $z \in \{0,3,4\}$, then the average Bob-to-Alice distillable key conditioned on
this information would be less than I(X : Y|Z). Likewise, if Eve were to reveal whether or not
$z \in \{0,1,2\}$, then the Alice-to-Bob distillable key conditioned on this information would be less than
I(X : Y|Z). Thus, since the key rate with no side information cannot exceed the average conditional
key rate, we conclude that I(X : Y|Z) is unattainable using one-way communication in
either direction. On the other hand, the distribution is easily seen to admit a key rate of I(X : Y|Z)
when the parties simply announce whether or not their variable belongs to the set {0,1}.

5 Conclusion 

In this paper, we have considered when a secret key rate of I(X : Y|Z) can be attained by Alice 
and Bob when working with a variety of auxiliary resources. The conditional mutual information 
quantifies the private key rate of Pxyz, which is the rate of key private from Eve that is attainable 
when Eve helps Alice and Bob by announcing her variable. Therefore, distributions for which 
K(X : Y||Z) = I(X : Y|Z) are those for which no assistance is provided by Eve when she functions
as a helper rather than a full adversary. 

We have found that with no additional communication, the key rate is I(X : Y|Z) if and only 
if the distribution is uniform block independent. Furthermore, supplying Alice and Bob with 
additional public randomness does not increase the distillable key rate. While this may not be 
overly surprising since the considered common randomness is uncorrelated with the source, it is 
nevertheless a nontrivial result because in general, randomness can serve as a resource in distillation
tasks [AC93, OSS14]. 

Turning to the one-way and two-way communication scenarios, we have presented in Theorems 2
and 3 necessary conditions for a distribution to attain the key rate I(X : Y|Z). The conditions we
have derived are all single-letter structural characterizations, and they are thus computationally
easy to apply. We leave open the question of whether Theorem 3 is also sufficient for attaining
I(X : Y|Z), although we have no strong reason to believe this is true. Further improvements 
to the results of this paper can possibly be obtained by studying tighter bounds on K(X : Y||Z) 
than the intrinsic information such as those presented in Refs. [RW03] and [GA10]. Nevertheless, 
we hope this paper has shed new light on the problem of secret key distillation under various 
communication settings. 
7 Appendix

7.1 Proof of Propositions 1 and 2 
Proposition. 

(a) Every pair of finite random variables XY has a unique maximal common partitioning. 

(b) Variable $J_{XY}$ satisfies

$$H(J_{XY}) = \max_{K} \{H(K) : 0 = H(K|X) = H(K|Y)\}$$

iff $J_{XY}$ is a common function for the maximal common partitioning of XY.

(c) If $f(X) = g(Y) = C$ is any other common function of X and Y, then C is a function of $J_{XY}$.

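Proof. (a) Trivially $\mathcal{X} \times \mathcal{Y}$ gives a common partitioning of length one, and any common partitioning
cannot have length exceeding $\min\{|\mathcal{X}|, |\mathcal{Y}|\}$; hence a maximal common partitioning exists. To
prove uniqueness, suppose that $(\mathcal{X}_i, \mathcal{Y}_i)_{i=1}^{t}$ and $(\mathcal{X}_i', \mathcal{Y}_i')_{i=1}^{t}$ are two maximal common partitionings.
If they are not equivalent, then there must exist some subset, say $\mathcal{X}_{i_0}$, such that $\mathcal{X}_{i_0} \subset \bigcup_{\lambda=1}^{k} \mathcal{X}_\lambda'$
in which $\mathcal{X}_{i_0} \cap \mathcal{X}_\lambda' \neq \emptyset$ for $\lambda = 1, \cdots, k \geq 2$. Choose any such $\mathcal{X}_{\lambda_0}'$ from this collection and define
the new sets $R_{i_0} = \mathcal{X}_{i_0} \cap \mathcal{X}_{\lambda_0}'$ and $\bar{R}_{i_0} = \mathcal{X}_{i_0} \setminus \mathcal{X}_{\lambda_0}'$, which are both nonempty since $k \geq 2$ and the $\mathcal{X}_\lambda'$
are disjoint. However, we also have the properties

$$\begin{aligned}
x \in \mathcal{X}_{i_0} &\Rightarrow p(\mathcal{Y}_{i_0}|x) = 1; & x \in \mathcal{X}_{\lambda_0}' &\Rightarrow p(\mathcal{Y}_{\lambda_0}'|x) = 1; \\
x \notin \mathcal{X}_{i_0} &\Rightarrow p(\mathcal{Y}_{i_0}|x) = 0; & x \notin \mathcal{X}_{\lambda_0}' &\Rightarrow p(\mathcal{Y}_{\lambda_0}'|x) = 0.
\end{aligned}$$

(Here we are implicitly using condition (iii) in the above definition by assuming that p(x) > 0,
thereby defining conditional distributions.) Therefore, $p(S_{i_0}|R_{i_0}) = p(\bar{S}_{i_0}|\bar{R}_{i_0}) = 1$ and
$p(S_{i_0}|\bar{R}_{i_0}) = p(\bar{S}_{i_0}|R_{i_0}) = 0$, where $S_{i_0} = \mathcal{Y}_{i_0} \cap \mathcal{Y}_{\lambda_0}'$ and $\bar{S}_{i_0} = \mathcal{Y}_{i_0} \setminus \mathcal{Y}_{\lambda_0}'$. A similar argument shows that
$p(R_{i_0}|S_{i_0}) = p(\bar{R}_{i_0}|\bar{S}_{i_0}) = 1$ and $p(R_{i_0}|\bar{S}_{i_0}) = p(\bar{R}_{i_0}|S_{i_0}) = 0$. Hence,
$(\mathcal{X}_i, \mathcal{Y}_i)_{i \neq i_0} \cup (R_{i_0}, S_{i_0}) \cup (\bar{R}_{i_0}, \bar{S}_{i_0})$ is a common partitioning of length $t + 1$. But this is a
contradiction since $(\mathcal{X}_i, \mathcal{Y}_i)_{i=1}^{t}$ is a maximal common decomposition.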

(b) Suppose that K satisfies $0 = H(K|X) = H(K|Y)$ so that $K = f(X) = g(Y)$ for some
functions $f$ and $g$. It is clear that $f$ and $g$ must be constant-valued for any pair of values taken from
the same block $\mathcal{X}_i \times \mathcal{Y}_i$ in the maximal common partitioning of XY. Hence the maximum possible
entropy of K is attained iff $f$ and $g$ take on a different value for each block in this partitioning.

(c) Suppose that C is not a function of $J_{XY}$. Then $H(CJ_{XY}) > H(J_{XY})$, which contradicts the
maximality of $J_{XY}$. □

Proposition. If $J_{XY}(x) = J_{XY}(x')$ for $x, x' \in \mathcal{X}$, then there exists a sequence of values

$$x\,y_1\,x_1\,y_2\,x_2 \cdots y_n\,x'$$

such that $p(x y_1)\,p(y_1 x_1)\,p(x_1 y_2) \cdots p(y_n x') > 0$.


Proof. Define the sets

$$\begin{aligned}
S_0 &= \{x\}, & T_1 &= \{y : p(y|S_0) > 0\}, \\
S_1 &= \{x \notin S_0 : p(x|T_1) > 0\}, & T_2 &= \{y \notin T_1 : p(y|S_1 \cup S_0) > 0\}, \\
&\;\;\vdots \\
T_n &= \{y \notin T_{n-1} : p(y|\cup_{k=0}^{n-1} S_k) > 0\}, \\
S_n &= \{x \notin S_{n-1} : p(x|\cup_{k=1}^{n} T_k) > 0\}, \;\cdots
\end{aligned} \tag{40}$$

Since $\mathcal{X}$ and $\mathcal{Y}$ are finite sets, there must exist some M and N such that $S_{M+1} = \emptyset$ and $T_{N+1} = \emptyset$.
Define $S = \cup_{k=0}^{M} S_k$ and $T = \cup_{k=1}^{N} T_k$. By construction we have $p(S|T) = p(T|S) = 1$, and since
$J_{XY}(x) = J_{XY}(x')$ we must have $x, x' \in S$. However, again by construction, we can always find a
sequence $x\,y_1\,x_1\,y_2\,x_2 \cdots y_n\,x'$ with $x_k \in \cup_{i=0}^{k} S_i$ and $y_k \in \cup_{i=1}^{k} T_i$, and so

$$p(x y_1)\,p(y_1 x_1)\,p(x_1 y_2) \cdots p(y_n x') > 0.$$

□
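The layered sets $S_n$ and $T_n$ in the proof amount to a breadth-first exploration of the bipartite support graph of $p_{XY}$, alternating between x-values and y-values. A minimal sketch of that reading follows; the function name and the dictionary format `{(x, y): prob}` are assumptions made here for illustration.

```python
from collections import deque

def connecting_sequence(pxy, x_start, x_end):
    """Return [x, y1, x1, ..., yn, x'] whose consecutive pairs all have
    positive probability under pxy, or None if no such sequence exists."""
    ys_of, xs_of = {}, {}
    for (x, y), p in pxy.items():
        if p > 0:
            ys_of.setdefault(x, []).append(y)
            xs_of.setdefault(y, []).append(x)
    parent = {("x", x_start): None}
    queue = deque([("x", x_start)])
    while queue:
        node = queue.popleft()
        if node == ("x", x_end):
            path = []
            while node is not None:  # walk back to the starting value
                path.append(node[1])
                node = parent[node]
            return path[::-1]
        side, value = node
        if side == "x":
            neighbours = [("y", y) for y in ys_of.get(value, [])]
        else:
            neighbours = [("x", x) for x in xs_of.get(value, [])]
        for nb in neighbours:
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return None

# Assumed example: x-values 0, 1, 2 share one block; x = 3 sits in another.
pxy = {(0, "a"): 0.2, (1, "a"): 0.2, (1, "b"): 0.2, (2, "b"): 0.2, (3, "c"): 0.2}
print(connecting_sequence(pxy, 0, 2))
print(connecting_sequence(pxy, 0, 3))
```

A returned sequence certifies that the two x-values lie in the same block, while `None` indicates that they are separated by the partitioning.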


References 

[AC93] R. Ahlswede and I. Csiszar. Common randomness in information theory and cryptography.
I. Secret sharing. Information Theory, IEEE Transactions on, 39(4):1121-1132, 1993.
doi:10.1109/18.243431.

[BBCM95] C.H. Bennett, G. Brassard, C. Crepeau, and U.M. Maurer. Generalized privacy
amplification. Information Theory, IEEE Transactions on, 41(6):1915-1923, 1995.
doi:10.1109/18.476316.

[CFH14] Eric Chitambar, Ben Fortescue, and Min-Hsiu Hsieh. A classical analog to entanglement
reversibility, 2014. Manuscript in preparation.

[CK11] Imre Csiszar and Janos Korner. Information Theory: Coding Theorems for Discrete
Memoryless Systems. Cambridge University Press, Cambridge, UK, 2011.

[CN00] I. Csiszar and P. Narayan. Common randomness and secret key generation with a
helper. Information Theory, IEEE Transactions on, 46(2):344-366, 2000.
doi:10.1109/18.825796.

[CRW03] M. Christandl, R. Renner, and S. Wolf. A property of the intrinsic mutual information.
In Information Theory, 2003. Proceedings. IEEE International Symposium on, pages 258-258,
June 2003. doi:10.1109/ISIT.2003.1228272.

[GA10] A.A. Gohari and V. Anantharam. Information-theoretic key agreement of multiple
terminals; Part I. Information Theory, IEEE Transactions on, 56(8):3973-3996, 2010.
doi:10.1109/TIT.2010.2050832.

[GK73] P. Gacs and J. Korner. Common information is far less than mutual information.
Problems of Control and Information Theory, 2(2):149, 1973.

[Mau93] U.M. Maurer. Secret key agreement by public discussion from common information.
Information Theory, IEEE Transactions on, 39(3):733-742, 1993.
doi:10.1109/18.256484.

[MW99] U.M. Maurer and S. Wolf. Unconditionally secure key agreement and the intrinsic
conditional information. Information Theory, IEEE Transactions on, 45(2):499-514, 1999.
doi:10.1109/18.748999.

[OSS14] Maris Ozols, Graeme Smith, and John A. Smolin. Bound entangled states with a
private key and their classical counterpart. Phys. Rev. Lett., 112:110502, Mar 2014.
doi:10.1103/PhysRevLett.112.110502.

[RW03] Renato Renner and Stefan Wolf. New bounds in secret-key agreement: The gap between
formation and secrecy extraction. In Advances in Cryptology EUROCRYPT 2003,
volume 2656 of Lecture Notes in Computer Science, pages 562-577. Springer Berlin Heidelberg,
2003. doi:10.1007/3-540-39200-9_35.


